Properties of the n-overlap vector and n-overlap similarity theory
Abstract
In the first part of this paper we define the n-overlap vector, whose coordinates are the fractions of the objects (e.g. books, N-grams, ...) that belong to 1, 2, ..., n sets (more generally: families) (e.g. libraries, databases, ...). With the aid of Lorenz concentration theory we build a theory of n-overlap similarity and corresponding measures, such as the generalized Jaccard index (which generalizes the well-known Jaccard index, the case n = 2). Next we determine the distributional form of the n-overlap vector, assuming certain distributions of the object sizes and of the set (family) sizes. In this part the decreasing power law and the decreasing exponential distribution are explained for the n-overlap vector. Both item (token) n-overlap and source (type) n-overlap are studied.
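As a small illustration of the definition in the abstract, the sketch below computes an n-overlap vector for a handful of sets. It rests on assumptions not stated in the abstract: "belong to k sets" is read as "belong to exactly k sets", fractions are taken over the distinct objects in the union, and the function name `overlap_vector` is hypothetical. It does not attempt the paper's generalized Jaccard index, whose definition the abstract does not give.

```python
from collections import Counter
from fractions import Fraction


def overlap_vector(sets):
    """Return [x_1, ..., x_n], where x_k is the fraction of distinct objects
    (over the union of all sets) that belong to exactly k of the n sets.

    Assumption: "exactly k" membership and union-based fractions; this is an
    illustrative reading, not necessarily the paper's formal definition.
    """
    n = len(sets)
    membership = Counter()
    for s in sets:
        # set(s) guards against duplicates inside a single collection
        membership.update(set(s))
    total = len(membership)
    if total == 0:
        raise ValueError("at least one set must be non-empty")
    # counts[k] = number of objects that occur in exactly k sets
    counts = Counter(membership.values())
    return [Fraction(counts.get(k, 0), total) for k in range(1, n + 1)]


# Three hypothetical libraries holding books a..e
libraries = [{"a", "b", "c"}, {"b", "c", "d"}, {"c", "e"}]
vec3 = overlap_vector(libraries)

# For n = 2 the last coordinate is |A ∩ B| / |A ∪ B|, the classical Jaccard index
a, b = {"a", "b", "c"}, {"b", "c", "d"}
vec2 = overlap_vector([a, b])
jaccard = Fraction(len(a & b), len(a | b))
```

Under these assumptions the coordinates always sum to 1, and in the two-set case the second coordinate coincides with the ordinary Jaccard index, which is the sense in which an n-set measure can reduce to the familiar one at n = 2.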
Similar resources
A Nested-Splicing by Overlap Extension PCR Improves Specificity of this Standard Method
Background: Splicing by overlap extension (SOE) PCR is used to create mutations in the coding sequence of an enzyme in order to study the role of specific residues in the protein's structure and function. Objectives: We introduced a nested SOE-PCR (N-SOE-PCR) in order to increase the specificity and to generate mutations in a gene by SOE-PCR. Materials and Methods: Genomic DNA from Bacillus thermo...
First-principles study on the electronic structure of Thiophenbithiol (TBT) on Au(100) surface
First-principles calculations were performed using density functional theory within the local spin density approximation (LSDA) to understand the electronic properties of the Au(100)+TBT system and compare the results with Au(100) and bulk Au properties. Band structure, the total DOS and charge density for these materials are calculated. We found that the HOMO for Au(100)+TBT becomes broader than Au...
Overlap-based feature weighting: The feature extraction of Hyperspectral remote sensing imagery
Hyperspectral sensors provide a large number of spectral bands. This massive and complex data structure of hyperspectral images presents a challenge to traditional data processing techniques. Therefore, reducing the dimensionality of hyperspectral images without losing important information is a very important issue for the remote sensing community. We propose to use overlap-based feature weigh...
Dual Promoter Vector Construction for Simultaneous Gene Expression Using Spliced Overlap Extension by Polymerase Chain Reaction (SOE-PCR) Technique
There are two different co-expression systems including bicistronic; dual-vector or two-promoter to express two different genes simultaneously and also to study protein-protein interactions. Bicistronic system has disadvantages e.g. compared with two-promoter system. In this paper, a simple method based on spliced overlap extension by polymerase chain reaction (SOE-PCR) technique was demonstrat...
Journal title:
- JASIST
Volume 57
Publication date 2006